Processing Serbian Written Texts: An Overview of Resources and Basic Tools
Abstract
In this paper we describe the resources and tools for processing texts written in Serbian that have been developed within the University of Belgrade NLP group, located at the Faculty of Mathematics. The main features of these resources, namely the available monolingual and multilingual corpora and various e-dictionaries, are briefly described. The use of Intex, the group's main tool, for recognizing unknown words, tagging text, building local grammars, and disambiguation is outlined.
Similar resources
An Overview of Resources and Basic Tools for the Processing of Serbian Written Texts
In this paper we describe the resources and tools for the processing of texts written in Serbian. Most of the resources have been developed within the University of Belgrade NLP group, located at the Faculty of Mathematics. The main features of these resources, namely the available monolingual and multilingual corpora and various e-dictionaries, are briefly described. The use of Intex, the main tool ...
Automatic Recognition of Composite Verb Forms in Serbian
In this paper, we will present the work on building a shallow parser for recognizing composite verb forms in Serbian – the forms that consist of an auxiliary verb and a main verb. The parser is made in Unitex, a corpus processing software, in the form of local grammars that rely on using morphological dictionaries of Serbian. The model was tested on a small corpus of texts, both written in Serb...
Metadiscourse Markers Revisited in EFL Context: The Case of Iranian Academic Learners’ Perception of Written Texts
Moving in line with the postulation that metadiscourse (MD) markers help transform a dry and tortuous piece of text into a coherent and reader-friendly one, the researchers in the current study attempted to investigate the effect different metadiscourse markers might have on Iranian EFL learners’ perception of written texts. To this end, 120 undergraduate English students were given three diffe...
The Extent of Using the Basic Vocabulary in the First Grade Quran Textbook
S. B. Alavi Moghaddam, Ph.D. Textbooks need to be written in such a way that their readers can understand them. One way of ensuring this in first grade textbooks would be the use of basic vocabulary, as determined by Ne'matzadeh et al. (1384). Considering the importance of the Quran text...
Resources for Processing Hebrew
We describe work in progress whose main objective is to create a collection of resources and tools for processing Hebrew. These resources include corpora of written texts, some of them annotated in various degrees of detail; tools for collecting, expanding and maintaining corpora; tools for annotation; lexicons, both monolingual and bilingual; a rule-based, linguistically motivated morphologica...
Publication date: 2006